Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition

Abstract

We investigate the effectiveness of generative adversarial networks (GANs) for speech enhancement, in the context of improving noise robustness of automatic speech recognition (ASR) systems. Prior work demonstrates that GANs can effectively suppress additive noise in raw waveform speech signals, improving perceptual quality metrics; however, this technique was not justified in the context of ASR. In this work, we conduct a detailed study to measure the effectiveness of GANs in enhancing speech contaminated by both additive and reverberant noise. Motivated by recent advances in image processing, we propose operating GANs on log-Mel filterbank spectra instead of waveforms, which requires less computation and is more robust to reverberant noise. While GAN enhancement improves the performance of a clean-trained ASR system on noisy speech, it falls short of the performance achieved by conventional multi-style training (MTR). By appending the GAN-enhanced features to the noisy inputs and retraining, we achieve a 7% WER improvement relative to the MTR system.
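
The last step of the abstract, appending GAN-enhanced features to the noisy inputs, can be pictured as a per-frame concatenation of two log-Mel feature matrices. The sketch below is only an illustration under stated assumptions: the function names, the 100-frame by 40-bin shapes, and the WER values 10.0 and 9.3 are hypothetical and do not come from the paper. It also reads "7% WER improvement relative to the MTR system" as a relative reduction, which is an interpretation, not something the abstract spells out.

```python
import numpy as np


def append_enhanced(noisy_logmel: np.ndarray, enhanced_logmel: np.ndarray) -> np.ndarray:
    """Concatenate noisy and enhanced features frame by frame.

    Both inputs are assumed to have shape (frames, mel_bins); the result
    has shape (frames, 2 * mel_bins).
    """
    if noisy_logmel.shape != enhanced_logmel.shape:
        raise ValueError("noisy and enhanced features must have the same shape")
    return np.concatenate([noisy_logmel, enhanced_logmel], axis=-1)


def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction of new_wer with respect to baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical placeholder features: 100 frames, 40 log-Mel bins each.
noisy = np.zeros((100, 40), dtype=np.float32)
enhanced = np.ones((100, 40), dtype=np.float32)
stacked = append_enhanced(noisy, enhanced)

# Made-up WER values chosen only to illustrate the arithmetic.
improvement = relative_wer_improvement(10.0, 9.3)
```

With these placeholder inputs, each frame of `stacked` holds the noisy bins followed by the enhanced bins, so a retrained acoustic model would see both views of the same frame.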